{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Building a Knowledge Graph with LlamaCloud & Neo4j\n",
    "\n",
    "RAG is a powerful technique for enhancing LLMs with external knowledge, but traditional semantic similarity search often fails to capture nuanced relationships between entities and can miss critical context that spans multiple documents. By transforming unstructured documents into structured knowledge representations, we can perform complex graph traversals, relationship queries, and contextual reasoning that goes far beyond simple similarity matching.\n",
    "\n",
    "Tools like LlamaParse, LlamaClassify, and LlamaExtract provide robust parsing, classification, and extraction capabilities to convert raw documents into structured data, while Neo4j serves as the backbone for knowledge graph representation. Together they form the foundation of GraphRAG architectures that understand not just what information exists, but how it connects.\n",
    "\n",
    "In this end-to-end tutorial, we will walk through an example of legal document processing that showcases the full pipeline shown below.\n",
    "\n",
    "The pipeline contains the following steps:\n",
    "- Use [LlamaClassify](https://docs.cloud.llamaindex.ai/llamaclassify/getting_started), a LlamaCloud tool currently in beta, to classify contracts\n",
    "- Leverage [LlamaExtract](https://www.llamaindex.ai/llamaextract) to extract different sets of relevant attributes tailored to each specific contract category based on the classification\n",
    "- Store all structured information into a Neo4j knowledge graph, creating a rich, queryable representation that captures both content and intricate relationships within legal documents\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting Up Requirements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index-workflows llama-cloud-services jsonschema openai neo4j llama-index-llms-openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "import uuid\n",
    "from datetime import date\n",
    "from typing import List, Optional\n",
    "\n",
    "from getpass import getpass\n",
    "from neo4j import AsyncGraphDatabase\n",
    "from openai import AsyncOpenAI\n",
    "from pydantic import BaseModel, Field\n",
    "\n",
    "from llama_cloud_services.beta.classifier.client import ClassifyClient\n",
    "from llama_cloud.types import ClassifierRule\n",
    "from llama_cloud.client import AsyncLlamaCloud\n",
    "\n",
    "from llama_cloud_services.extract import (\n",
    "    ExtractConfig,\n",
    "    ExtractMode,\n",
    "    LlamaExtract,\n",
    "    SourceText,\n",
    ")\n",
    "from llama_cloud_services.parse import LlamaParse\n",
    "\n",
    "from llama_index.llms.openai import OpenAI"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download Sample Contract\n",
    "\n",
    "Here, we download a sample contract PDF from the CUAD dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://raw.githubusercontent.com/tomasonjo/blog-datasets/5e3939d10216b7663687217c1646c30eb921d92f/CybergyHoldingsInc_Affliate%20Agreement.pdf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set Up Neo4j\n",
    "\n",
    "For Neo4j, the simplest approach is to create a free [Aura database instance](https://neo4j.com/product/auradb/), and copy your credentials here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "db_url = \"your-db-url\"\n",
    "username = \"neo4j\"\n",
    "password = \"your-password\"\n",
    "\n",
    "neo4j_driver = AsyncGraphDatabase.driver(\n",
    "    db_url,\n",
    "    auth=(\n",
    "        username,\n",
    "        password,\n",
    "    ),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parse the Contract with LlamaParse\n",
    "\n",
    "Next, we provide our API keys, then set up LlamaParse and parse the PDF. In this case, we're using `parse_page_without_llm` mode."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"LLAMA_CLOUD_API_KEY\"] = getpass(\"Enter your Llama API key: \")\n",
    "os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter your OpenAI API key: \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Started parsing the file under job_id 6687bc90-d4eb-48a5-b56a-f4bfe8f00d33\n"
     ]
    }
   ],
   "source": [
    "# Initialize parser with specified mode\n",
    "parser = LlamaParse(parse_mode=\"parse_page_without_llm\")\n",
    "\n",
    "# Define the PDF file to parse\n",
    "pdf_path = \"CybergyHoldingsInc_Affliate Agreement.pdf\"\n",
    "\n",
    "# Parse the document asynchronously\n",
    "results = await parser.aparse(pdf_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    Exhibit 10.27\n",
      "\n",
      "    MARKETING AFFILIATE AGREEMENT\n",
      "\n",
      "                 Between:\n",
      "\n",
      "\n",
      "Birch First Global Investments Inc.\n",
      "\n",
      "                And\n",
      "\n",
      "\n",
      "    Mount Knowledge Holdings Inc.\n",
      "\n",
      "\n",
      "    Dated: May 8, 2014\n",
      "\n",
      "    1\n",
      "\n",
      "\n",
      "    Source: CYBERGY HOLDINGS, INC., 10-Q, 5/20/2014\n"
     ]
    }
   ],
   "source": [
    "print(results.pages[0].text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Contract classification with LlamaClassify (beta)\n",
    "\n",
    "In this example, we want to classify incoming contracts as either `Affiliate Agreements` or `Co Branding`. For this cookbook, we are going to use **[LlamaClassify](https://docs.cloud.llamaindex.ai/llamaclassify/getting_started/)**, our AI-powered document classification tool, currently in **beta**.\n",
    "\n",
    "### Define classification rules\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rules = [\n",
    "    ClassifierRule(\n",
    "        type=\"Affiliate Agreements\",\n",
    "        description=\"This is a contract that outlines an affiliate agreement\",\n",
    "    ),\n",
    "    ClassifierRule(\n",
    "        type=\"Co Branding\",\n",
    "        description=\"This is a contract that outlines a co-branding deal/agreement\",\n",
    "    ),\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize the ClassifyClient and Run a Classification Job"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "classifier = ClassifyClient(\n",
    "    client=AsyncLlamaCloud(\n",
    "        base_url=\"https://api.cloud.llamaindex.ai\",\n",
    "        token=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result = await classifier.aclassify_file_path(\n",
    "    rules=rules,\n",
    "    file_input_path=\"CybergyHoldingsInc_Affliate Agreement.pdf\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Classification Result: affiliate_agreements\n",
      "Classification Reason: The document is titled 'Marketing Affiliate Agreement' and repeatedly refers to one party as the 'Marketing Affiliate' (MA). The agreement grants the MA the right to market, advertise, and sell the company's technology, and outlines the duties and responsibilities of the affiliate in marketing, licensing, and supporting the technology. There is no mention of joint branding, shared trademarks, or collaborative marketing under both parties' brands, which would be characteristic of a co-branding agreement. The content is entirely consistent with an affiliate agreement, where one party markets and sells products or services on behalf of another, rather than a co-branding arrangement.\n"
     ]
    }
   ],
   "source": [
    "classification = result.items[0].result\n",
    "print(\"Classification Result: \" + classification.type)\n",
    "print(\"Classification Reason: \" + classification.reasoning)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting Up LlamaExtract\n",
    "\n",
    "Next, we define the schemas we'll use to extract relevant information from our contracts. The fields we define are a mix of summarization and structured data extraction.\n",
    "\n",
    "Here we define two Pydantic models: `Location` captures structured address information with optional fields for country, state, and address, while `Party` represents contract parties with a required name and optional location details. The Field descriptions help guide the extraction process by telling the LLM exactly what information to look for in each field."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Location(BaseModel):\n",
    "    \"\"\"Location information with structured address components.\"\"\"\n",
    "\n",
    "    country: Optional[str] = Field(None, description=\"Country\")\n",
    "    state: Optional[str] = Field(None, description=\"State or province\")\n",
    "    address: Optional[str] = Field(None, description=\"Street address or city\")\n",
    "\n",
    "\n",
    "class Party(BaseModel):\n",
    "    \"\"\"Party information with name and location.\"\"\"\n",
    "\n",
    "    name: str = Field(description=\"Party name\")\n",
    "    location: Optional[Location] = Field(\n",
    "        None, description=\"Party location details\"\n",
    "    )"
   ]
  },
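  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, we can instantiate these models directly to see how validation and the optional fields behave. This is a hypothetical example, not part of the pipeline itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical sanity check: instantiate the schema models directly\n",
    "party = Party(\n",
    "    name=\"Birch First Global Investments Inc.\",\n",
    "    location=Location(country=\"U.S. Virgin Islands\"),\n",
    ")\n",
    "\n",
    "# Unset optional fields default to None\n",
    "party.model_dump()"
   ]
  },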
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember that we have multiple contract types, so we define a specific extraction schema for each type and a mapping to dynamically select the appropriate schema based on the classification result.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BaseContract(BaseModel):\n",
    "    \"\"\"Base contract class with common fields.\"\"\"\n",
    "\n",
    "    parties: Optional[List[Party]] = Field(\n",
    "        None, description=\"All contracting parties\"\n",
    "    )\n",
    "    agreement_date: Optional[str] = Field(\n",
    "        None, description=\"Contract signing date. Use YYYY-MM-DD\"\n",
    "    )\n",
    "    effective_date: Optional[str] = Field(\n",
    "        None, description=\"When contract becomes effective. Use YYYY-MM-DD\"\n",
    "    )\n",
    "    expiration_date: Optional[str] = Field(\n",
    "        None, description=\"Contract expiration date. Use YYYY-MM-DD\"\n",
    "    )\n",
    "    governing_law: Optional[str] = Field(\n",
    "        None, description=\"Governing jurisdiction\"\n",
    "    )\n",
    "    termination_for_convenience: Optional[bool] = Field(\n",
    "        None, description=\"Can terminate without cause\"\n",
    "    )\n",
    "    anti_assignment: Optional[bool] = Field(\n",
    "        None, description=\"Restricts assignment to third parties\"\n",
    "    )\n",
    "    cap_on_liability: Optional[str] = Field(\n",
    "        None, description=\"Liability limit amount\"\n",
    "    )\n",
    "\n",
    "\n",
    "class AffiliateAgreement(BaseContract):\n",
    "    \"\"\"Affiliate Agreement extraction.\"\"\"\n",
    "\n",
    "    exclusivity: Optional[str] = Field(\n",
    "        None, description=\"Exclusive territory or market rights\"\n",
    "    )\n",
    "    non_compete: Optional[str] = Field(\n",
    "        None, description=\"Non-compete restrictions\"\n",
    "    )\n",
    "    revenue_profit_sharing: Optional[str] = Field(\n",
    "        None, description=\"Commission or revenue split\"\n",
    "    )\n",
    "    minimum_commitment: Optional[str] = Field(\n",
    "        None, description=\"Minimum sales targets\"\n",
    "    )\n",
    "\n",
    "\n",
    "class CoBrandingAgreement(BaseContract):\n",
    "    \"\"\"Co-Branding Agreement extraction.\"\"\"\n",
    "\n",
    "    exclusivity: Optional[str] = Field(\n",
    "        None, description=\"Exclusive co-branding rights\"\n",
    "    )\n",
    "    ip_ownership_assignment: Optional[str] = Field(\n",
    "        None, description=\"IP ownership allocation\"\n",
    "    )\n",
    "    license_grant: Optional[str] = Field(\n",
    "        None, description=\"Brand/trademark licenses\"\n",
    "    )\n",
    "    revenue_profit_sharing: Optional[str] = Field(\n",
    "        None, description=\"Revenue sharing terms\"\n",
    "    )\n",
    "\n",
    "\n",
    "mapping = {\n",
    "    \"affiliate_agreements\": AffiliateAgreement,\n",
    "    \"co_branding\": CoBrandingAgreement,\n",
    "}"
   ]
  },
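  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Schema selection is then a plain dictionary lookup on the classification label. As a hypothetical guard (not part of the original flow), `.get()` lets us fail loudly on an unexpected contract type:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical guard: select the extraction schema from the classification label\n",
    "schema_cls = mapping.get(classification.type)\n",
    "if schema_cls is None:\n",
    "    raise ValueError(f\"No extraction schema for contract type: {classification.type}\")\n",
    "schema_cls.__name__"
   ]
  },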
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Uploading files: 100%|██████████| 1/1 [00:00<00:00,  1.81it/s]\n",
      "Creating extraction jobs: 100%|██████████| 1/1 [00:00<00:00,  1.43it/s]\n",
      "Extracting files: 100%|██████████| 1/1 [00:12<00:00, 12.75s/it]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'parties': [{'name': 'Birch First Global Investments Inc.',\n",
       "   'location': {'country': 'U.S. Virgin Islands',\n",
       "    'state': None,\n",
       "    'address': '9100 Havensight, Port of Sale, Ste. 15/16, St. Thomas, VI 0080'}},\n",
       "  {'name': 'Mount Knowledge Holdings Inc.',\n",
       "   'location': {'country': 'United States',\n",
       "    'state': 'Nevada',\n",
       "    'address': '228 Park Avenue S. #56101 New York, NY 100031502'}}],\n",
       " 'agreement_date': '2014-05-08',\n",
       " 'effective_date': '2014-05-08',\n",
       " 'expiration_date': None,\n",
       " 'governing_law': 'State of Nevada',\n",
       " 'termination_for_convenience': True,\n",
       " 'anti_assignment': True,\n",
       " 'cap_on_liability': \"Company's liability shall not exceed the fees that MA has paid under this Agreement.\",\n",
       " 'exclusivity': None,\n",
       " 'non_compete': None,\n",
       " 'revenue_profit_sharing': 'MA receives a purchase discount based on annual purchase level: 15% for $0-$100,000, 20% for $100,001-$1,000,000, 25% for $1,000,001 and above. MA pays a quarterly service fee of $15,000 if no sales occur in a quarter.',\n",
       " 'minimum_commitment': 'MA commits to purchase a minimum of 100 Units in aggregate within the Territory within the first six months of the term of this Agreement.'}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "extractor = LlamaExtract()\n",
    "\n",
    "agent = extractor.create_agent(\n",
    "    name=f\"extraction_workflow_import_{uuid.uuid4()}\",\n",
    "    data_schema=mapping[classification.type],\n",
    "    config=ExtractConfig(\n",
    "        extraction_mode=ExtractMode.BALANCED,\n",
    "    ),\n",
    ")\n",
    "\n",
    "result = await agent.aextract(\n",
    "    files=SourceText(\n",
    "        text_content=\" \".join([el.text for el in results.pages]),\n",
    "        filename=pdf_path,\n",
    "    ),\n",
    ")\n",
    "\n",
    "result.data"
   ]
  },
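  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since the schema asks the LLM for `YYYY-MM-DD` strings, it can be worth validating the date fields before importing into Neo4j; `date.fromisoformat` raises a `ValueError` on malformed values. This pre-import check is a hypothetical addition, not part of the original pipeline:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical pre-import check: extracted date strings must be ISO formatted\n",
    "for key in (\"agreement_date\", \"effective_date\", \"expiration_date\"):\n",
    "    value = result.data.get(key)\n",
    "    if value is not None:\n",
    "        date.fromisoformat(value)  # raises ValueError if malformed\n",
    "print(\"All date fields are valid ISO dates or absent\")"
   ]
  },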
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import into Neo4j\n",
    "\n",
    "The final step is to take our extracted structured information and build a knowledge graph that represents the relationships between contract entities. We need to define a graph model that specifies how our contract data should be organized as nodes and relationships in Neo4j.\n",
    "\n",
    "\n",
    "Our graph model consists of three main node types:\n",
    "- **Contract nodes** store the core agreement information, including dates, terms, and legal clauses\n",
    "- **Party nodes** represent the contracting entities with their names\n",
    "- **Location nodes** capture geographic information with address components\n",
    "\n",
    "Now we'll import our extracted contract data into Neo4j according to our defined graph model.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'_contains_updates': True, 'labels_added': 5, 'relationships_created': 4, 'nodes_created': 5, 'properties_set': 16}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import_query = \"\"\"\n",
    "WITH $contract AS contract\n",
    "MERGE (c:Contract {path: $path})\n",
    "SET c += apoc.map.clean(contract, [\"parties\", \"agreement_date\", \"effective_date\", \"expiration_date\"], [])\n",
    "// Cast to date\n",
    "SET c.agreement_date = date(contract.agreement_date),\n",
    "    c.effective_date = date(contract.effective_date),\n",
    "    c.expiration_date = date(contract.expiration_date)\n",
    "\n",
    "// Create parties with their locations\n",
    "WITH c, contract\n",
    "UNWIND coalesce(contract.parties, []) AS party\n",
    "MERGE (p:Party {name: party.name})\n",
    "MERGE (c)-[:HAS_PARTY]->(p)\n",
    "\n",
    "// Create location nodes and link to parties\n",
    "WITH p, party\n",
    "WHERE party.location IS NOT NULL\n",
    "MERGE (p)-[:HAS_LOCATION]->(l:Location)\n",
    "SET l += party.location\n",
    "\"\"\"\n",
    "\n",
    "response = await neo4j_driver.execute_query(\n",
    "    import_query, contract=result.data, path=pdf_path\n",
    ")\n",
    "response.summary.counters"
   ]
  },
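  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To verify the import, we can read the graph back. This hypothetical check returns each contract with its parties and their locations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical verification query: read contracts, parties, and locations back\n",
    "verify_query = \"\"\"\n",
    "MATCH (c:Contract)-[:HAS_PARTY]->(p:Party)\n",
    "OPTIONAL MATCH (p)-[:HAS_LOCATION]->(l:Location)\n",
    "RETURN c.path AS contract, p.name AS party, l.country AS country\n",
    "\"\"\"\n",
    "\n",
    "records, _, _ = await neo4j_driver.execute_query(verify_query)\n",
    "for record in records:\n",
    "    print(record[\"contract\"], record[\"party\"], record[\"country\"])"
   ]
  },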
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bringing it All Together in a Workflow\n",
    "\n",
    "Finally, we can combine all of this logic into a single executable agentic workflow. The workflow accepts a single PDF and adds a new entry to our Neo4j graph on each run."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "affiliate_extraction_agent = extractor.create_agent(\n",
    "    name=\"Affiliate_Extraction\",\n",
    "    data_schema=AffiliateAgreement,\n",
    "    config=ExtractConfig(\n",
    "        extraction_mode=ExtractMode.BALANCED,\n",
    "    ),\n",
    ")\n",
    "cobranding_extraction_agent = extractor.create_agent(\n",
    "    name=\"CoBranding_Extraction\",\n",
    "    data_schema=CoBrandingAgreement,\n",
    "    config=ExtractConfig(\n",
    "        extraction_mode=ExtractMode.BALANCED,\n",
    "    ),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.workflow import (\n",
    "    Workflow,\n",
    "    Event,\n",
    "    Context,\n",
    "    StartEvent,\n",
    "    StopEvent,\n",
    "    step,\n",
    ")\n",
    "\n",
    "\n",
    "class ClassifyDocEvent(Event):\n",
    "    parsed_doc: str\n",
    "    pdf_path: str\n",
    "\n",
    "\n",
    "class ExtractAffiliate(Event):\n",
    "    file_path: str\n",
    "\n",
    "\n",
    "class ExtractCoBranding(Event):\n",
    "    file_path: str\n",
    "\n",
    "\n",
    "class BuildGraph(Event):\n",
    "    file_path: str\n",
    "    data: dict\n",
    "\n",
    "\n",
    "class KnowledgeGraphBuilder(Workflow):\n",
    "    def __init__(\n",
    "        self,\n",
    "        classifier: ClassifyClient,\n",
    "        rules: List[ClassifierRule],\n",
    "        affiliate_extract_agent: LlamaExtract,\n",
    "        cobranding_extract_agent: LlamaExtract,\n",
    "        **kwargs,\n",
    "    ):\n",
    "        super().__init__(**kwargs)\n",
    "        self.classifier = classifier\n",
    "        self.rules = rules\n",
    "        self.affiliate_extract_agent = affiliate_extract_agent\n",
    "        self.cobranding_extract_agent = cobranding_extract_agent\n",
    "\n",
    "    @step\n",
    "    async def classify_contract(\n",
    "        self, ctx: Context, ev: StartEvent\n",
    "    ) -> ExtractAffiliate | ExtractCoBranding | StopEvent:\n",
    "        result = await self.classifier.aclassify_file_path(\n",
    "            rules=self.rules, file_input_path=ev.pdf_path\n",
    "        )\n",
    "        contract_type = result.items[0].result.type\n",
    "        print(contract_type)\n",
    "        if contract_type == \"affiliate_agreements\":\n",
    "            return ExtractAffiliate(file_path=ev.pdf_path)\n",
    "        elif contract_type == \"co_branding\":\n",
    "            return ExtractCoBranding(file_path=ev.pdf_path)\n",
    "        else:\n",
    "            return StopEvent()\n",
    "\n",
    "    @step\n",
    "    async def extract_affiliate(\n",
    "        self, ctx: Context, ev: ExtractAffiliate\n",
    "    ) -> BuildGraph:\n",
    "        result = await self.affiliate_extract_agent.aextract(ev.file_path)\n",
    "        return BuildGraph(data=result.data, file_path=ev.file_path)\n",
    "\n",
    "    @step\n",
    "    async def extract_co_branding(\n",
    "        self, ctx: Context, ev: ExtractCoBranding\n",
    "    ) -> BuildGraph:\n",
    "        result = await self.cobranding_extract_agent.aextract(ev.file_path)\n",
    "        return BuildGraph(data=result.data, file_path=ev.file_path)\n",
    "\n",
    "    @step\n",
    "    async def build_graph(self, ctx: Context, ev: BuildGraph) -> StopEvent:\n",
    "        import_query = \"\"\"\n",
    "    WITH $contract AS contract\n",
    "    MERGE (c:Contract {path: $path})\n",
    "    SET c += apoc.map.clean(contract, [\"parties\", \"agreement_date\", \"effective_date\", \"expiration_date\"], [])\n",
    "    // Cast to date\n",
    "    SET c.agreement_date = date(contract.agreement_date),\n",
    "      c.effective_date = date(contract.effective_date),\n",
    "      c.expiration_date = date(contract.expiration_date)\n",
    "\n",
    "    // Create parties with their locations\n",
    "    WITH c, contract\n",
    "    UNWIND coalesce(contract.parties, []) AS party\n",
    "    MERGE (p:Party {name: party.name})\n",
    "    MERGE (c)-[:HAS_PARTY]->(p)\n",
    "\n",
    "    // Create location nodes and link to parties\n",
    "    WITH p, party\n",
    "    WHERE party.location IS NOT NULL\n",
    "    MERGE (p)-[:HAS_LOCATION]->(l:Location)\n",
    "    SET l += party.location\n",
    "    \"\"\"\n",
    "        response = await neo4j_driver.execute_query(\n",
    "            import_query, contract=ev.data, path=ev.file_path\n",
    "        )\n",
    "        return StopEvent(response.summary.counters)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "knowledge_graph_builder = KnowledgeGraphBuilder(\n",
    "    classifier=classifier,\n",
    "    rules=rules,\n",
    "    affiliate_extract_agent=affiliate_extraction_agent,\n",
    "    cobranding_extract_agent=cobranding_extraction_agent,\n",
    "    timeout=None,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running step classify_contract\n",
      "affiliate_agreements\n",
      "Step classify_contract produced event ExtractAffiliate\n",
      "Running step extract_affiliate\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Uploading files: 100%|██████████| 1/1 [00:00<00:00,  2.84it/s]\n",
      "Creating extraction jobs: 100%|██████████| 1/1 [00:00<00:00,  1.68it/s]\n",
      "Extracting files: 100%|██████████| 1/1 [00:19<00:00, 19.04s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Step extract_affiliate produced event StopEvent\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "response = await knowledge_graph_builder.run(\n",
    "    pdf_path=\"CybergyHoldingsInc_Affliate Agreement.pdf\"\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
